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ABSTRACT 

Context. Most observational results on the high redshift restframe UV-bright galaxies are based on samples pinpointed using the 
so-called dropout technique or Ly-o- selection. However, the availability of multifilter data now allows the dropout selections to be re¬ 
placed by direct methods based on photometric redshifts. In this paper we present the methodology to select and study the population 
of high redshift galaxies in the ALHAMBRA survey data. 

Aims. Our aim is to develop a less biased methodology than the traditional dropout technique to study the high redshift galaxies in 
ALHAMBRA and other multifilter data. Thanks to the wide area ALHAMBRA covers, we especially aim at contributing to the study 
of the brightest, least frequent, high redshift galaxies. 

Methods. The methodology is based on redshift probability distribution functions (zPDFs). It is shown how a clean galaxy sample 
can be obtained by selecting the galaxies with high integrated probability of being within a given redshift interval. However, reaching 
both a complete and clean sample with this method is challenging. Hence, a method to derive statistical properties by summing the 
zPDFs of all the galaxies in the redshift bin of interest is introduced. 

Results. Using this methodology we derive the galaxy rest frame UV number counts in five redshift bins centred at z = 
2.5,3.0,3.5,4.0, and 4.5, being complete up to the limiting magnitude at m uv (AB)=24, where m uv refers to the first ALHAMBRA 
filter redwards of the Ly -a line. With the wide field ALHAMBRA data we especially contribute to the study of the brightest ends of 
these counts, accurately sampling the surface densities down to /i7uv(AB)=21-22. 

Conclusions. We show that using the zPDFs it is easy to select a very clean sample of high redshift galaxies. We also show that 
it is better to do statistical analysis of the properties of galaxies using a probabilistic approach, which takes into account both the 
incompleteness and contamination issues in a natural way. 
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1. Introduction 

Identifying and studying high redshift galaxies is crucial for 
our understanding of the early epochs of galaxy evolution. At 


* Based on observations collected at the German-Spanish 
Astronomical Center, Calar Alto, jointly operated by the Max- 
Planck-Institut fur Astronomie (MPIA) at Heidelberg and the Instituto 
de Astroftsica de Andalucta (CSIC) 


the beginning of the nineties, the implementation of the so- 
called dropout technique opened the era for detections of co¬ 
pious numbers of these early ga l axies ( e.g. Guhathakurt a~et~ail 
1990: ISteidel fe Hamiltonl[l992[ fl99.3: St eidel et al.ll1996allhh . 


These galaxies are discovered based on their broadband colours, 
i.e. by measuring the drop in brightness due to the Lyman break 
at rest frame 912 A and/or the Lyman forest between 912 A and 
1216 A. For high redshift galaxies (z. > 2) these features are 
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detected at optical or infrared wavelengths and permit the detec¬ 
tion of these so-called Lyman-break galaxies (LBGs) from the 
ground. The dropout technique is sensitive to galaxies that are 
young enough to produce copious amounts of ultraviolet light, 
and are sufficiently dust free for a fair amount of this light to 
escape the galaxy. 

Detections of high redshift galaxies opened the possibil¬ 
ity for observational studies of some fundamental questions of 
galaxy evolution and cosmology at early epochs. One of the 
most widely studied properties are the LBG rest frame ultraviolet 
(UV) luminosity functions. The UV luminosities of the galaxies 
(once corrected for dust extinction) are directly proportional to 
their star formation rates. Hence, the study of the UV luminosity 
density, derived by integrating the luminosity function at differ¬ 
ent redshifts, gives information about the star formation history 
in the Universe. 


Lyman-break galaxies can also act as tracers of dark mat¬ 
ter at high redshift through the study of their clustering proper¬ 
ties. The formation history of galaxies is basically understood 
through two fundamental evolutionary processes, i.e. the pro¬ 
duction of stars and the accumulation of dark matter. While the 
baryonic matter, i.e. stars, gas, and dust, can be studied through 
the light they emit, the dark matter cannot be directly detected 
using electromagnetic waves. However, the clustering properties 
of galaxies are closely related to the distribution and amount of 
the underlying dark matter (see lOuchi et al.ll2004bi. and refer¬ 
ences therein). 


Most of these studies, up to the very recent ones, 
have applied the dropo u t technique for c andidate selection 
(e.g. Ouchi et al. 2004hllat Shim et alJl2007t iReddv et aUl2008t 
iLv et al.l201 lb Bouwens et al.l20l4 and many more). While this 
technique is efficient at selecting high redshift galaxies, it is also 
affected by significant incompleteness and contamination, los¬ 
ing some fraction of the population at the selected redshift, or 
allowing galaxies at other redshifts to enter the sample. While 
the latter can be dealt with b y obtaining spectros copic redshifts 
(see e.g. lSteidel et al.ll996allbl : lReddv et alJ2006l) . the former re¬ 
mains a serious difficulty. We are not yet at the point of spectro¬ 
scopic blind surveys, hence, a step forward towards less biased 
candidate selection is offered by multifilter surveys. They com¬ 
bine the efficiency and unbiased nature of photometric surveys 
with very low resolution spectral information, permitting us to 
derive more information on the surveyed objects such as their 
accurate photometric redshifts. 


Many authors (e.g. IShim et al.1 l2007t ILv et al.1 1201 ll) have 
combined the data of their colour selected LBG samples with in¬ 
formation at other passbands in order to carry out spectral energy 
distribution (SED) fitting and to derive more information on the 
objects in question, like their photometric redshifts. However, 
basing the actual candidate selection on photometric redshifts 
as, for example, iMcLure et al.1 (2 006 1 have done, has only re¬ 
cently started to become a common practice. As discussed by 
IMcLure et al.1 (1201 ll) . when multifilter data are available this ap¬ 
proach has several advantages over the traditional colour selec¬ 
tion. It makes the best use of the available information in mul¬ 
tiple filters, it should be less biased as any colour preselection 
is not required, and it directly offers the photometric redshifts 
for the galaxies of interest and allows the competing photomet¬ 
ric redshift solutions at low redshift to be investigated. Recently, 
iLe Fevre et al.l ( 2014 ') have used photometric redshifts to select 
an unbiased target sample of high redshift galaxies for the VUDS 
survey. Photometric redshift selection is also used in the frame¬ 
work of the CLASH survey (e.g. iBradlev et al.l 12014 ) and in 


the re cent works of iFinkelstein et al.l (120141) and iBowler et al.l 
(12014 . 

In this paper we introduce a method for studying high red¬ 
shift (z ~ 2 - 5) galaxies based on their photometric redshifts. 
Our study makes use of the complete redshift probability dis¬ 
tribution functions (zPDFs), rather than the best redshift (i.e. 
the median derived from the zPDF or the highest peak of the 
zPDF). We show how a very clean candidate selection can be 
made based on the zPDFs and discuss how this technique also 
suffers from contamination issues if completeness is tried to be 
reached. Finally, we discuss why for many statistical purposes 
candidate selection is not needed. Instead, these studies can be 
based directly on the redshift (and the corresponding luminos¬ 
ity, mass, star formation rate, etc.) probability distributions. As 
an example we present probabilistic number counts for several 
high redshift bins. These counts should be free from contami¬ 
nation and incompleteness issues, if the used zPDFs correctly 
reflect the uncertainties in the redshift estimations. 

The method is developed and tested with the data from the 
Advanced Large, Homogeneous Area Medium Band Redshift 
Astronomical (ALHAMBRA, iMoles et~akl 120081) Survey. The 
total area used for our study is 2.38 deg 2 , covered with 20 
medium band optical filters, plus /, H, and K s in the near in¬ 
frared (NIR). In addition to the novel methodology, an advantage 
of our ALHAMBRA high redshift galaxy study as compared to 
the previous LBG studies is the large area the survey covers, split 
into eight independent fields, reducing biases due to the cosmic 
variance and allowing the study of the rarest, brightest, high red¬ 
shift galaxies. Galaxies at z ~ 1 in ALHAMBRA (plus GALEX, 
IRAC, MIPS, and PACS) were studied in iOteo et al.1 (l2013allbl) . 
In this first paper about ALHAMBRA z > 2 galaxies, we con¬ 
centrate on the methodology of studying the galaxy properties 
using the whole information in their zPDFs. In the subsequent 
papers, this methodology will be applied to studying the proper¬ 
ties of these galaxies. 

The methodology presented here is generic and can be ap¬ 
plied to any multifilter data set with accurate zPDFs, such as the 
data from the SH ARDS (the Survey for High- z Absorption Red 
and Dead Sources. fPerez-Gonzalez et~aPl2013b . and the future J- 
PLUS (Javalambre Photometric Local Universe Survey, Cenarro 
et al., in prep.) and J-PAS (Javalambre-PAU Astrophysical 
Survey, iBenitez et al.ll2014h . 

The outline of this paper is as follows. Section [2] describes 
the ALHAMBRA data used for this study. Section [3 gives an 
introduction to the ALHAMBRA photo-z derivation and its va¬ 
lidity for high redshift galaxies. In Sect. 4 the sample selection is 
described, the contamination and completeness of the sample are 
discussed, and our sample selection is compared with the tradi¬ 
tional dropout selections. In Sect. [3 the probabilistic approach is 
introduced and the rest frame UV number counts are derived. A 
summary is given in Sect. [6] Where necessary, we assume O m = 
0.3, Ga= 0.7, and Hn = 70 km s _1 Mpc~'. Magnitudes are given 
in the AB system (Oke & Gunnll 1 983l) . 


2. Data 

ALHAMBRA ( Moles et al . 2008 ) has mapped a total of 4 deg 2 
of the northern sky in eight separate fields during a seven year 
period (2005 - 2012). Of the total surveyed area, 2.8 deg 2 have 
been completed with all the filters (2.38 deg 2 after masking, 
as will be detailed in Sect. II). ALHAMBRA uses a specially 
designed filter system (lAnaricio Villegas et al.1 [201 0!) that cov¬ 
ers the optical range from 3500 A to 9700 A with 20 con¬ 
tiguous, equal width (~300 A FWHM), medium band filters. 
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A [A] 


Fig.l. The z ~ 3 composite LBG spectrum of IShanlev et al.l (120031) moved to redshifts z = 2.2,3.0,4.0, and 5.0 considering the 
variation of the intergalactic opacity with redshift (blue lines). For clarity, the spectra are shifted vertically and plotted only up to 
1460A (restframe). The ALHAMBRA optical filter transmission curves are overplotted as shaded grey areas, and the dashed red 
line corresponds to the synthetic E814W filter. The first filter redwards of the Ly -a line in each redshift is marked in darker grey. [A 
colour version of this figure is available in the online edition.] 

plus the three standard broadbands, /, H , and K s , in the NIR. 

The photometric system has been specifically designed to op¬ 
timise photometric redshift depth and accuracy dBemtez et al.l 
2009 ). The observations were carried out with the Calar Alto 
3.5m telescope using two wide field cameras: LAICA in the op¬ 
tical, and OMEGA-2000 in the NIR. The 5cr limiting magni¬ 
tude reaches > 24 for all filters below 8000 A and increases 
steeply towards redder medium-band filters, up to m(AB) ~ 

21.5 for the reddest optical filter at 9700 A (see Fig. 37 of 
[Molino et alil2014l ). In the NIR the limiting magnitude is ~ 23 
for J, ~ 22.5 for H, and ~ 22 for K s . For details about the 
NIR data reduction see Cristobal-Hornillos et al.l (120091) . while 
the optical reduction is described in Cristobal-Hornillos et al. 

(in prep). The ALHAMBRA object catalogues and the asso¬ 
ciated Bayesian photometric redshifts (BPZs) are described in 
iMolino et al] (120141) and are available through the ALHAMBRA 
web pagtfj At the moment only the best BPZs are public; the 
full zPDFs will be published in the future. In Fig. Q] we show the 
transmission curves of the optical ALHAMBRA filters together 
with the z~3 composite spectrum of 811 LBGs of lShanlev et al.l 
( 2003 ) moved to different redshifts. 

3. Photometric redshifts 

The work in this paper relies on the photometric redshifts pro¬ 
vided for all the objects in the ALHAMBRA catalogue as de¬ 
tailed bv IMolino et al.l ( 2014 1. These photometric redshifts were 
estimated using BPZ2.0 (Benitez et al., in prep), an updated ver¬ 
sion of the Bayesian Photometric Redshift (BPZ) code dBenltezI 
1200(1 ). This code uses Bayesian inference where a maximum 
likelihood, resulting from a x 2 minimisation between the ob¬ 
served and predicted colours for a galaxy among a range of 
redshifts and templates, is weighted by a prior probability. The 
maximum likelihood (ML) method may suffer from colour- 
redshift degeneracies (like 4000A break vs. Lyman break) and 

1 http://alhambrasurvey.com 


the inclusion of a suitable prior information can help to break 
these degeneracies. However, both maximum likelihood and 
Bayesian redshift probability distributions are available for all 
the ALHAMBRA sources. 

The BPZ2.0 SED library (see IMolino et al.l 12014l) consists 
of 11 SEDs: five templates for elliptical galaxies, two for spi¬ 
ral galaxies, and four for starburst galaxies along with emis¬ 
sion lines and dust extinction. The opacity of the i nterga lactic 
medium has been applied as described in iMadatil ( 1995 ). The 
prior used gives the probability of a galaxy with apparent mag¬ 
nitude mo having a certain redshift z and spectral type T. The 
prior has been empirically derived for each spectral type and 
magnitude by fittin g luminos ity functions pr ovided by GO ODS- 
MUSIC (Sant ini etal.ll2009l) . COSMOS (IScoville et all l2007h . 
and UDF dCoe etaTll200fih . 

For each catalogued ALHAMBRA object both the maxi¬ 
mum likelihood and Bayesian redshift probability distribution 
functions (zPDFs) are given separately for each template used in 
the ^-fitting. We are not interested in limiting ourselves to any 
galaxy type, hence, we use the redshift PDFs integrated over all 
templates, and normalised to one: 


/ = If PDF(z,T)dzdT = 1. 


( 1 ) 


These zPDFs give the probability along the redshift axis of a 
galaxy in question to be at that redshift. Hence, the probability, 
p, that a galaxy is within the redshift bin z\ < z < Zi is 


-f 


PDF(z)dz. 


( 2 ) 


3.1. ALHAMBRA redshifts for high-z galaxies 

The first questions to solve before blindly using the photomet¬ 
ric redshift information for analysing high redshift galaxies are: 
Can we really trust these redshifts for high-z galaxies? Is it more 
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reliable to use the maximum likelihood (ML) or the Bayesian, 
full probability (FP), redshift probability distributions? 

The idea of the prior information is to reduce the redshift 
estimation uncertainties. However, the prior information should 
be used only if it really can be trusted. The complete census 
of high redshift galaxies is still poorly known and the known 
census is most probably biased (see e.g. Le Fevre et al.l[2014 1. 
Hence, using any prior information based on such a census could 
introduce undesired biases or uncertainties. For this reason, our 
answer to the second question above would a priori be to base 
our study on the ML redshift information. This will be further 
studied in the following. 

While the accuracy of the ALHAMBRA BPZs is well tested 
and demonstrated dMolino et al.ll2014() for galaxies up to z~ 1.5 
(being ~ 1%), this is not the case for the galaxies that we are 
interested in because of the small number of spectroscopic red- 
shifts for high redshift galaxies in ALHAMBRA area. In the 
sample of ~ 7200 galaxies with spectroscopic redshifts used to 
verify the ALHAMBRA photo-z accuracy (iMolino et al J l2014 L 
there are only 57 with redshifts above z = 2.2 (the lowest red¬ 
shift at which the Lyman forest would be sampled by at least 
one ALHAMBRA filter). Of these only 12 are brighter than 
m = 24 in the first filter redwards of the Ly -a line. A litera¬ 
ture search reveales that five of these are classified as quasars, 
and as the BPZ template library does not include quasar spec¬ 
tra, we do not expect to be able to accurately recover their 
redshifts. Hence, we are left with seven spectroscopically con¬ 
firmed bright normal high redshift galaxies. In Fig. [2] we show 
the ALHAMBRA ML and FP zPDFs for these seven galaxies 
together with their ALHAMBRA coordinates and spectroscopic 
redshift (from [Barger et al.112008 '). We see that for five of them 
(Nos. 1, 3, 4, 6, 7), the redshift is reasonably well recovered, 
A z < 0.3, where A z is the difference between the first peak of 
the zPDF and the spectroscopic redshift. For galaxy 4, whose 
shift between the spectroscopic and photomnetric redshift is the 
largest of the five, the shift corresponds almost exactly to a width 
of one filter, i.e. it seems BPZ has mistaken the location of the 
Ly-a break by one filter. Of the remaining two, for galaxy 2, the 
first peaks of both ML and FP zPDFs are located at low redshift, 
but the peaks at high redshift enclose most of the probability. 
Galaxy 5 shows a secondary peak at high redshift (higher than 
the spectroscopic redshift), but most of the probability resides 
at low redshift. The spectra of these objects are not public mak¬ 
ing it hard to further study the reason for these discrepancies 
between the spectroscopic and photometric redshifts. However 
from the ALHAMBRA SEDs we infer that, most probably, these 
discrepancies derive from the common confusion between the 
4000A break and Lyman break. 

To have a better control on the expected redshifts, and a 
wider range of magnitudes to be tested, we carried out a sim¬ 
ulation. For this purpose we used the z = 3 composite LBG 
spectrum of IShanlev et alJ ( 2003 ). We moved this spectrum to 
different redshifts: z = 2.2,3.0,4.0,5.0. The lowest redshift was 
selected such that the Lyman forest would be sampled at least by 
one ALHAMBRA filter, while considering the previous work on 
LBG number counts (e.g. lYoshida et all2006l) . we do not expect 
to discover many galaxies above z = 5 owing to the magnitude 
limits of ALHAMBRA. 

To simulate the different redshifts, the original spectrum was 
first moved to redshift z=0 removing the effect of cosmic opac¬ 
ity using the equations of iMadaul ( 1995 ). then the same equa¬ 
tions were used to simulate the spectra at different redshifts. The 
original spectrum cover the wavelength range from 920 A to 


2000 A. To cover the whole ALHAMBRA optical wavelength 
range in the simulated redshifts, we artificially extended it as¬ 
suming a flat behaviour of the UV continuum ( F v =constant, i.e. 
F, i oc A 2 ) from 2000 A redwards up to the B aimer break at 4000 
A, and from 920 A bluewards, down to 912 A where the flux is 
assumed to drop abruptly adopting a cosmic opacity r e // = 10 
for A < 912 A. 

The resulting spectra were convolved with the ALHAMBRA 
filters. The convolved spectra were scaled to the desired magni¬ 
tudes (at the first filter redwards from the Ly-cr) to sample the 
magnitude range m = 20.2 - 24.0. The lower magnitude was 
defined so that we really could expect to have galaxies of this 
magnitude at our lowest redshift bin (see ILv et al.ll201 II) . while 
the upper limit was set to reach the ALHAMBRA sensitivity 
limit. 

We also considered realistic errors for each magnitude 
at each filter. To obtain these, we selected one arbitrary 
ALHAMBRA field and studied how the magnitude error varied 
with magnitude for each filter. Using all the objects in the field, 
we created mag vs. magjerr curves for each filter and found the 
best fitting solutions of the form magjerr - a + b * e c * mag . The 
expected errors at each magnitude and filter were then obtained 
from these equations and assigned to the simulated LBG spectra. 

Finally, a Monte Carlo simulation was carried out. Each 
LBG spectrum was perturbed inside its error bars 100 times. 
When the simulated magnitude was below the 1 cr detection limit 
(adapted again from one arbitrary ALHAMBRA field), it was 
replaced by this lcr limiting magnitude, as required by BPZ for 
non-detections. The BPZ code was run for each of the simulated 
spectra to obtain both their FP and ML redshift probability dis¬ 
tributions. We studied the recovered distributions with two ques¬ 
tions in mind: 1) How well can we recover LBGs as high redshift 
galaxies? For this, we used equation (O and tested how often the 
galaxies would be recovered to have a probability p > 0.9 to be 
within a redshift bin 1.9 < z < 5.3. The redshift bin was selected 
to be wider than the range of the input redshifts so that small 
errors in redshift would not place the borderline objects outside 
the tested range; and 2) How accurately is the redshift of these 
simulated galaxies recovered? 

The recovery rate of LBGs as high-redshift galaxies is sum¬ 
marised in Table Q] where the percentage of simulated LBGs 
having a probability greater than 90% to be within the desired 
redshift range for each input redshift and magnitude are listed 
for both the FP and ML redshift distributions. We see that, in 
general, the recovered fraction is worse for the z — 2.2 LBGs 
than for the higher redshift sources. We assume that the lower 
redshifts are recovered with less accuracy, because the lower 
the redshift, the less pronounced is the characteristic Ly-cr break 
and it is seen with fewer filters. We also see that while the ML 
method recovers the high redshift nature of the simulated galax¬ 
ies very well, the Bayesian approach gives worse results. It sys¬ 
tematically fails for the brightest magnitudes, reducing the prob¬ 
ability of the LBGs to be at high redshift below our 90% limit, 
and also starts failing for the fainter magnitudes earlier than the 
ML approach. 

We note that according to earlier studies (see lYoshida et al.l 
l200d ) for galaxies at redshift z > 4 we possibly could not ex¬ 
pect to observe rest frame UV magnitudes brighter than 22. This 
would partially justify why the Bayesian approach fails to re¬ 
cover the redshifts of these galaxies. If in addition the bright¬ 
est end of the ILv et al . (1201 1) surface density plots were dom¬ 
inated by interlopers, the redshift recovered by the Bayesian 
method could actually be reasonable. However, knowledge of 
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Fig. 2. The maximum likelihood (solid blue line) and Bayesian (dashed green line) zPDFs for eight galaxies with spectroscopic 
redshifts. The spectroscopic redshifts (z s ) are given in each panel and are also marked as dashed black vertical lines. See the text for 
more details. [A colour version of this figure is available in the online edition .] 


Table 1. The percentage fraction of simulated LBGs of different 
redshifts and magnitudes fulfilling our selection criterion. 


Maximum likelihood Bayesian 


Mag" 

2.2 

3.0 

4.0 

5.0 

2.2 

3.0 

4.0 

5.0 

20.2 

100 

100 

100 

100 

0 

0 

0 

0 

21. 

99 

100 

100 

100 

0 

0 

0 

0 

22. 

95 

100 

100 

100 

70 

14 

9 

1 

23. 

73 

100 

100 

100 

61 

99 

100 

0 

24. 

55 

95 

99 

97 

38 

68 

77 

35 


the high redshift galaxy population is still very incomplete and 
most probably biased. Hence, basing any study on prior knowl¬ 
edge of such a population might be dangerous and could lead to 
further biases as our simulation also indicates. We do not want 
to take the risk of losing the especially interesting bright objects. 
For these reasons, we decided to base our study on the ML red¬ 
shift probability distributions, i.e. to assume a flat prior. 

To test the accuracy of the recovered redshifts, we summed 
the ML zPDFs of the simulated LBGs in order to see how well 
the input redshifts were recovered. In Fig. [3] we show these 


Table 2. The recovered redshifts for 100 simulated LBGs at dif¬ 
ferent magnitudes and redshifts. Presented are the average value 
and its standard deviation derived from Gaussian approxima¬ 
tions of the summed zPDFs in Fig. [3] 


Mag 

;in=2.2 

z in=3.0 

jjn=4.0 

z in=5.0 

20.2 

2.46 ±0.02 

2.92 ±0.02 

3.909 ±0.007 

4.975 ±0.001 

21.0 

2.44 ±0.04 

3.2 ±0.2 

3.91 ±0.01 

4.975 ±0.008 

22.0 

2.5 ±0.2 

3.2 ±0.1 

3.91 ±0.01 

4.97 ±0.02 

23.0 

2.5 ±0.3 

3.2 ±0.1 

3.90 ±0.03 

4.97 ±0.03 

24.0 

2.5 ±0.5 

3.2 ±0.2 

3.89 ±0.07 

4.98 ±0.06 


summed zPDFs at three different magnitudes. In Table [2] we list 
the input redshifts and the recovered average redshifts and their 
sigma, derived from Gaussian approximations of the summed 
zPDFs. The recovered redshifts generally show a bias towards 
smaller or higher z than the input redshift, the bias becoming 
smaller with increasing z. It is not surprising that the redshift is 
worse recovered at the lower simulated z, as at lower z the Lyman 
forest is sampled by fewer ALHAMBRA filters (see Fig.Q}, and 
the Ly-o' break is less pronounced at lower redshifts. However, 
it is intriguing to see that even though the recovered redshift 


5 
























K. Viironen et al.: High redshift galaxies in the ALHAMBRA survey 





Fig. 3. Recovered summed redshift distributions, normalised to 
one at the integrated probability, for 100 simulated LBGs at 
z=2.2 (blue lines), z=3.0 (red lines), z=4.0 (black lines), and 
z=5.0 (green lines) for three differentrest frame UV magnitudes. 
The solid lines correspond to the simulation with the original 
composite LBG spectrum, and the dashed and dotted lines to 
the simulations with the same spectrum, but the Ly-a line re¬ 
moved and doubled, respectively. [A colour version of this figure 
is available in the online edition .] 


becomes more and more peaked towards higher z, a system¬ 
atic bias towards smaller redshift remains. Bayesian Photometric 
Redshift templates do not include the Ly-a emission line, while 
this line is present in the composite spectrum used for the simu¬ 
lations. The presence of the line could dilute the Ly-a break and 


cause the bias in the redshift estimation towards lower z. To test 
this hypothesis, we manually removed the Ly-a line from the 
composite spectrum, and repeated the simulation. We also re¬ 
peated the simulation doubling the Ly-a line strength. We have 
plotted the resulting summed zPDFs in Fig. [3] together with the 
original results. In the two largest simulated redshifts the ten¬ 
dency of increasing Ly-a line strength to increasingly underesti¬ 
mate the redshift is obvious. At the two lower redshifts this is not 
enough to explain the involved uncertainties. However, in all the 
simulated redshifts the average sizes of the biases are Az < 0.3, 
and, since we work with rather rough redshift bins, we consider 
the obtained accuracy acceptable. 

4. A sample selection approach 

In this section we present one way of selecting a clean sample of 
high redshift galaxy candidates, using the zPDFs, and check it 
against traditional dropout selections. Our sample selection con¬ 
sists of two steps: cleaning the catalogue from non-desired de¬ 
tections, and applying a redshift selection. While the first step is 
always needed, we will discuss later that, while sometimes use¬ 
ful, for many purposes a redshift selection is actually not needed. 

4.1. Catalogue selection 

We start our candidate selection by cleaning the ALHAMBRA 
catalogues of any possible spurious or false detections, du¬ 
plicated detection s, and stars. For t his pu rpose we used the 
masks defined in lArnalte-Mur et al . (12014!) describing the sky 
area which has been reliably observed, and the stellar fl ag pro - 
vided in the ALHAMBRA catalogues (see iMolino et alJl2014h . 
setting ’’Stellar-Flag” < 0.51 in order to remove stars. This 
should remove the stars up to m < 22.5 in the reference filter, 
F814W. Above this magnitude the stellar flag is not defined, 
and slight contamination by faint stars may remain. However, 
for fainter magnitudes, the fraction of stars compared to galax¬ 
ies declines rapidly, with a contribution of ~ 10% for magni¬ 
tudes m(F814W) = 22.5, declining to ~ 1% for magnitudes 
m(F8l4W) = 23.5 (IMolino et al.ll2014h . After these steps, our 
data consist of a total of 362788 galaxies in 2.38 deg 2 . 

4.2. The redshift selection 

There is no one single correct way of applying zPDFs for 
candidate selection. The best redshift (e.g. the first peak) can 
be derived from the zPDF and assigned to each galaxy (e.g. 
iLe Fevre et al.l [2014] ) or the zPDF can be integrated and used 
in one way or another to select a list of candidates (e.g. 
iMcLure et al.l20lItlDuncan et al.l20T4t) . Here we use the second 
approach in a very simplified way in order to select a clean sam¬ 
ple of high redshift ALHAMBRA galaxies. With this approach 
one is not obliged to be limited to any specific redshift range. 
However, we limit our study to the redshift range 2.2 < z < 5.0. 
The lower limit is set so that we sample the Lyman forest, i.e. 
the spectrum bluewards of the Lya line, with at least one fil¬ 
ter. Because of the depth of ALHAMBRA we do not expect to 
find many galaxies at the upper limit of z > 5.0. In addition, the 
ALHAMBRA sensitivity limit worsens rapidly for wavelengths 
above ~ 8000A, and with the upper redshift limit we make sure 
to measure the UV continuum redwards the Lya break in at least 
two filters bluer than 8000A. 

When all of the information on the redshift probability distri¬ 
bution is used, one can select as candidates all the galaxies that 
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Fig. 4. zPDFs for two galaxies with very different ML Odds 
(black lines). Top: ML Odds- 0.898, Bottom: ML Odds= 0.084. 
Overplotted are the corresponding Gaussian approximations of 
the distributions (dashed red lines). [A colour version of this fig¬ 
ure is available in the online edition.] 

have a probability greater than a given threshold of being at the 
desired redshift interval. This threshold can then be selected to 
obtain the desired balance between completeness and contami¬ 
nation. To introduce this technique, we decided to opt for a clean 
selection and select as candidates the objects fulfilling the crite¬ 
rion 



o-(z) 

Fig. 5. The redshift error distribution for a Gaussian approxima¬ 
tion for our sample of galaxies at high redshift. 


it is related to the confidence of the photometric redshifts, mak¬ 
ing it possible to derive high quality samples with better accu¬ 
racy and a lower rate of catastrophic outliers. As an example, in 
Fig.0we show two zPDFs satisfying criterion <0, but with very 
different ML Odds parameters. We also show the Gaussian ap¬ 
proximations of the corresponding redshift distributions. While 
it is obvious that a selection in Odds can refine the redshift se¬ 
lection, we do not make any further selection of this kind. We 
do not need such a high precision in redshift in order to reject 
the objects like the one in the bottom panel of Fig. [4] Instead, 
for all of the galaxies that pass our selection criterion, we calcu¬ 
lated the cr of their redshift distribution for the Gaussian approx¬ 
imation. The cr distribution for the sample galaxies is shown in 
Fig. 0 The average and median of this distribution are 0.13 and 
0.11, respectively. We consider this precision to be high enough. 
Alternatively, in the probabilistic approach (Sect. [5J, the whole 
redshift distribution is taken into account in a natural way; how¬ 
ever, we will come back to the question of Odds in Sect. 0 We 
note that in addition to the random errors, we can expect to have 
systematic errors in the derived redshifts, as was discussed in 
Sect. 13. II and shown in Fig. 0 and Table 0 However, consider¬ 
ing the expected size of these biases, when working with coarse 
redshift bins, this should not be a problem. 



PDF(z) dz > 0.90, 


(3) 


i.e. all the galaxies with a probability of 90% or higher of being 
at the redshift range that we are interested in. This leads to a 
sample of a total of 9203 high redshift galaxies. 

We note that methodologically this selection could easily be 
further refined, if needed. One way would be to study the con¬ 
centration of the probability distribution around its peak value, 
e.g. by calculating the ML analogy for the Odds parameter of¬ 
fered by the BPZ (lBemtezl|2000l) . The Odds quality parameter 
is a proxy for the photometric redshift reliability of the sources. 
The Odds parameter is defined as the redshift probability en¬ 
closed on a ±K( \ + z) region around the main peak in the zPDF 
of the source, where the constant K is specific for each photo¬ 
metric survey. Molino et al. (2014) find that K=0.0125 is the 
best value for ALHAMBRA since this is the expected averaged 
accuracy for most galaxies in the survey. Thus, Odds e [0,1] and 


4.3. Completeness and contamination 

We estimate here the expected level of contamination and in¬ 
completeness of our sample within the limiting magnitude of 
ALHAMBRA and assuming that the zPDFs correctly reflect the 
uncertainties in the redshift estimations. 

4.3.1. Contamination 

Presuming the assumption of a flat prior is true, the upper limit 
for the contamination is directly set by our selection criterion: 
as we select the objects with a > 90% probability of being at 
desired redshift, we automatically allow a contamination of < 
10% by galaxies at other redshifts. To get a more exact value of 
the expected contamination, we summed the probabilities of the 
objects selected by criterion (0 within the redshift interval 2.2 < 
z < 5.0. The resultant average probability is ~96.5%, meaning 
that we could expect a contamination by lower redshift galaxies 
of only 3.5%. We can expect a low level of contamination as our 
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selection criterion ([3} is rather strict; there may be galaxies with, 
e.g., a > 50% probability of being within our redshift bin but 
which are not selected by our criterion. This naturally leads to a 
low level of completeness in our sample as will be discussed in 
the next section. 

We expect that the most significant source of contamination 
of our high redshift galaxy sample are the faint red galaxies at 
low redshifts because of the confusion between the 4000 A and 
Lyman breaks. In addition, some faint cold stars may be in¬ 
cluded, as the preselection against stars is statistical and not de¬ 
fined for magnitudes fainter than m ~ 22.5 (iMolino et alj|2014h . 
and the noisy spectra of cold stars could be confused with the 
LBG spectra. 


4.3.2. Completeness 

The completeness at a given redshift bin is defined as the ratio 
of galaxies at the corresponding redshift that are detected and 
that also pass the selection criteria to all the rest frame UV-bright 
galaxies at the given redshift bin actually present in the Universe. 
It has been shown in lMolino et al.l d2014l) that the ALHAMBRA 
catalogues are ~ 100% complete up to m = 24 in the F814W de¬ 
tection filter. For the high redshift galaxies that we are interested 
in, this filter traces the UV continuum redwards of the Ly-a, the 
Ly-a break only slightly entering the F814 W passband at z = 5 
(see Fig. [T]). Hence, considering the UV continuum of the LBGs 
are generally flat (F v =constant), we can also expect a complete 
detection up to m = 24 in the first filter towards the Ly-cr line for 
the galaxies that we are interested in. 

If the flat prior assumption is correct, the expected com¬ 
pleteness due to our candidate selection can be derived by sum¬ 
ming the probability distributions within the redshift interval that 
we are interested in for all the objects in our cleaned catalogue 
which do not fulfil our selection criterion, i.e. 

r»5.0 

A' = ^ | PDFi(z)dz (4) 

; 2.2 


for all the objects i fulfilling the criterion 



PDFj(z)dz. < 


0.9. 


(5) 


This sum gives the expected total number of galaxies that are 
located in the redshift range that we are interested in, but not se¬ 
lected as such by our criterion. The total number is 40166.8, i.e. 
~ 4.4 times the objects in our sample. The completeness could 
be made higher by relaxing the criterion ([!}, at the cost of in¬ 
creasing the contamination. To carry out statistical studies on the 
high redshift galaxy population, we certainly should find a bet¬ 
ter compromise between the contamination and completeness. 
However, we will discuss later on how statistical studies can be 
carried out using the zPDFs directly without any previous candi¬ 
date selection. Hence, we stick to this candidate selection, which 
we know to be clean but incomplete, and we name it the clean 
sample. 


4.3.3. Quasars 

Quasar spectra are not included in the BPZ spectral templates. 
Hence, our selection can contain quasars, but we do not ex¬ 
pect a complete selection of quasars. We tested if the known 
quasars observed by ALHAMBRA would fulfil our redshift se¬ 
lection criterion ©. In total we found 205 ALHAMBRA objects 


that had counterparts identified as quasars with spectroscopic 
redshift in other surveys. They consist of 170 sources from 
iMatute et al.l (12012h (see also refer ences therein), one quasar at 
Z = 5.41 from IMatute et al.l d2013l) , 15 s ources from the SDSS 
quasar catalogue DR 10 ( iParis et aHl2014l) . and 19 X-ray sources 
from CHANDRA that have an associated optical and infrared 
counterpart JCivano et al.l i2012h . For the CHANDRA sources 
we also demanded that they were classified as point sources 
and their variability parameter was greater than 0.25, in agree¬ 
ment with ISalvato et al.l d2009h . Of these 205 quasars, 48 have 
a spectroscopic redshift in the range that we are interested in 
(2.2 < z < 5.0) and 2 are at higher redshifts (z. = 5.07 and 
z = 5.41), while the rest are located at lower redshifts. Of the 
objects at the redshift interval that we are interested in, 19 (40%) 
fulfil our z-selection criteria. In addition, 11 out of the remaining 
155 objects at lower-z (7%) enter our z-selection. The quasar at 
z = 5.07 is placed at z — 4.88 ± 0.03, i.e. also enters our redshift 
selection, and the quasar at z = 5.41 is placed at z = 5.30 ± 0.02, 
if Gaussian approximations are used (which in these cases is a 
good approximation as the redshift distributions show only one 
significant peak). However, the stellarity flag removes most of 
the quasars from our final sample so that in the end only five high 
redshift quasars (10%) and two (1.3%) lower redshift quasars en¬ 
ter our sample. We expect to have a better control of these objects 
once the ALHAMBRA quasar catalogue is available (Chaves- 
Montero et al., in prep.). 


To get an estimation of the maximum expected contamina¬ 
tion of quasars in our clean sample, we compared the /-band 
number counts of quasars at the redshift range z. = 2.2 - 3.5 
(iRoss et al.l 120131 ) to the total number of objects in our sample 
at the same redshift range. The redshift range z - 2 - 3 is of¬ 
ten known as the quasar epoch (ICroom et al.l 120091) , as this is 
where the number density of bright quasars peaks. Hence, the 
comparison at this redshift range gives an upper limit of the ex¬ 
pected total contamination by high redshift quasars in our sam¬ 
ple of high redshift galaxies. From the double power-law fit to 
the cumulative /-band number counts of quasars at z = 2.2 - 3.5 
dRoss et al.lt2013h . a total surface density of 263 deg -1 quasars 
with nti <= 24 is derived. Hence, in the ALHAMBRA area we 
would expect to have 2.38 degx263 deg -1 = 626 quasars brighter 
than nti = 24. The total number of galaxies brighter than m = 24 
in our clean sample at the same redshift bin is 1707. We roughly 
compare these numbers without considering a /.'-correction be¬ 
tween the /-band and our ALHAMBRA bands. If 10% of the 
quasars at the ALHAMBRA area enter our selection, as we in¬ 
fer from the spectroscopic sample, the total (maximum) rate of 
contamination of our clean sample by high-redshift quasars is 
0.1 x 626/1707 = 0.037, i.e. < 4%. 


In addition, we showed above th at 1,3% of quasars at z < 2.2 
can be included in our sample. IRoss et all d2013l) also give the 
prescription to calculate the quasar surface density at the redshift 
range 1.0 < z < 2.2, giving 99.6 deg -1 quasars with m, <= 
24. Doubling this to account for (i.e. overestimate for the much 
smaller volume) the quasars at the redshift range 0.0 < z < L0, 
gives an additional maximum contamination of 2 x 99.6 deg -1 x 
2.38 deg x 0.013 = 6 quasars brighter than m, = 24 at 0.0 < 
z < 2.2 in the ALHAMBRA area. If this is compared to the total 
amount of galaxies brighter than in = 24 in our clean sample 
(2296 galaxies), an additional maximum contamination rate of 
0.3% is obtained. 
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Fig. 6. Locations of our clean sample candidates in four colour-colour diagrams used for traditional dropout selections. The selection 
boxes in each diagram are shown with dashed lines and the redshift ranges they target are indicated in each panel. We only plot 
the candidates in the redshift range within lcr of the one targeted by these diagrams (blue crosses). In the top left diagram the blue 
crosses refer to the BX selection while the magenta dots to the LBG selection. See the text for more details. [A colour version of 
this figure is available in the online edition .] 


4.4. Comparison with traditional colour selections 


To see if the candidates in our clean sample would have been se¬ 
lected by traditional dropout methods, we tested how they would 
be located in some traditional colour-colour diagrams. In partic¬ 
ular, we opted for testing the BX selection ((z) = 2.20 + 0.32) 
" (12004 : the LBG selection «z> = 2.96 + 0.29) 
(i2003h : and the BRi' ((z> = 4.0 ± 0.3), Vi'z' 


Steidel et al 


Steidel et al. 


of 
of 

(( z) = 4.7 ± 0.3), and Ri'z' «z> = 4.9 + 0.2) LBG selections 
of lYoshida et al.1 (i2006i) . 

First, we carried out SED fitting on our sample galaxies 
in order to find a spectrum which we could then convolve 
with the broadband biters used in the above dropout selections. 
To assure a good SED-btting, we considered only the galax¬ 
ies with good quality photometry in all of the biters by set¬ 
ting ”irms_OPT_Flag” = 0 and ”irms_NIR_Flag” = 0. This re¬ 
quirement reduced our sample to 8023 galaxies. For the SED 
btting we used the singl e stellar population (SSP) models of 
iBruzual & Chariot (i2003b of all the available metallicities (six 
metallicity values in the range Z = 0.001 - 0.05) and of 40 ages 
roughly logarithmically spaced from 10 Myr to the_age of the 
Universe. We added the extinction law of iLeithereretaL j 20021 
at the wavelength range 970 A-1200 A, and that of lCalzetti et al l 
( 200 0) for longer wavelengths. At wavelengths below 970 A, 
where neither of the two laws is dehned, we adopted a constant 


extinction with a value equal to that at 970 A. The colour excess, 
E(B - V ), was varied in a range of realistic values: from 0.0 to 
0.5 (IShanlev et al.ll2003h in steps of A E(B - V) = 0.025. The 
model spectra were moved in redshift in steps of Az = 0.025 to 
sample the redshift range that we are interested in, so that at each 
redshift only the SSPs up to the age of the Universe at that time 
were considered. The Lyman forest was modelled following the 


prescriptions of lMadaul(ll995h . considering the Her, H/l, Hy, and 
H<) line blanketing. Below 912 A the opacity was assumed to in¬ 
crease abruptly, leading to practically zero bux bluewards of the 
Lyman break. 

These template spectra were convolved with the 
ALHAMBRA biter passbands. Each galaxy in our sample 
was btted by this template library using the ^-method so that 
only the templates with redshifts Ztempiate = (z) ± cr z were 
considered, i.e. those templates whose redshift is inside lcr from 
the median redshift of the btted galaxy as derived from its zPDF. 
The template spectrum whose bt produced the lowest value of 
X 2 3 was then assigned as the best bt template for each galaxy in 
our sample. Finally, only the galaxies brighter than m = 24 in 
the brst biter redwards from the Ly-ar line and with the reduced 
xfi < 2 (x 2 r = T 2 /(l ~ A0, where N is the number of biters used 
in the bt) were accepted for the analysis. These steps reduced 
our sample to 1844 and 1327 galaxies, respectively. 

The original spectra of these best bt templates were then 
convolved with the biter passbands of the broadband biters of 
interest and the objects were placed in the colour-colour dia¬ 
grams used in the dropout selections (Fig. [6}. To simulate the G, 
R, and U n passband data used in the selections of [Steidel et al. 
(120031) and lSteidel et al.l (120041) . we downloaded the correspond¬ 
ing transmi ssion curves from KP NO websit^B To simulate the 
selection of lYoshida et al.l ( 2006 ). the B, R, V ,and z! transmis¬ 
sion curves were downloaded from NAOJ websitcQ. 

In each diagram in Fig.[6]we plotted only those candidates of 
our sample whose (ALHAMBRA median) redshifts are within 
lcr from the one targeted by the corresponding dropout selec- 

2 https://www.noao.edu/kpno/mosaic/blters/blters.html 

3 http://www.naoj.org/Observing/Instruments/SCam/sensitivity.html 
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Fig. 7. Redshift probability distribution of a galaxy with a sig¬ 
nificant probability at both high and low redshift (black line). 
Overplotted is the corresponding Gaussian approximation of the 
distribution (dashed red line). [A colour version of this figure is 
available in the online edition .] 

tions. We see that basically all of our candidates would also be 
selected by these traditional colour-colour diagrams. The per¬ 
centages of the candidates inside the selection boxes are 99%, 
99%, 97%, and 94%, for the LBG, BRi', Vi'z', and Ri'z' selec¬ 
tions, respectively. The BX diagram shows the largest scatter 
outside the selection box, the fraction of candidates inside the 
box being 83%. The galaxy clearly outside the selection boxes 
in the bottom right corners of the bottom diagrams is the same 
one in both diagrams. It is a very faint object, and even though 
it is brighter than m = 24 in the first filter redwards of the Ly-a 
(the magnitude being m = 23.8), in all the other filters it is fainter 
than the 5cr limiting magnitude for the corresponding filter. 

5. Probabilistic approach 

The selection of the clean sample above is an example of the 
use of zPDFs when one needs a candidate selection and wants 
to be certain that the selected galaxies really are at desired red¬ 
shift. However, selecting both a clean and complete sample is 
challenging. If one would like to have a more complete sample, 
one could relax selection criterion (|3j. However, relaxing it, for 
example, to allow all the galaxies with a probability > 50% to be 
at high redshift to enter the sample would automatically lead to a 
contamination rate of < 50% (assuming the flat prior assumption 
is correct). Hence, for any statistical study one should carefully 
take care of the incompleteness and contamination corrections. 

For many purposes the candidate selection is not needed, 
but the galaxies and their properties can instead be consid¬ 
ered as continua described by their zPDFs. For each catalogued 
ALHAMBRA object, a zPDF is provided. For some galaxies, as 
for many objects in our clean sample, this distribution is narrow 
and could be approximated by a Gaussian distribution without 
losing much information. However, in other cases the distribu¬ 
tion is much more spread out and/o r is double peaked. This issu e 
was recently discussed in detail bv lLopez-Saniuan et ali ( 2014 ). 
In Fig. [7] we show an example of a two-peaked and a spread 
out distribution. Now, if we claimed the galaxy of Fig. |7]to be 
at any certain redshift bin z i - Zi we would certainly fail (un¬ 
less this bin were wide enough to cover the whole range where 
the PDF(z) > 0). However, for statistical purposes we can in¬ 
terpret the probabilities p of equation <0* as fractions. A similar 


approach was adopted by iMcLure et ali < 2009 1 when deriving 
LBG luminosity functions. 

5 .1 . Colour-colour diagrams 

In Fig. [8] we show the distribution of the whole set of 
ALHAMBRA galaxies as density contours in the same colour- 
colour diagrams as in Fig. [6] To obtain these contours, the cat¬ 
alogue was cleaned as explained in Sect. 14.11 In addition, good 
quality photo metry was required i n all th e filters. All the objects 
were fitted by [Bruzual & Charlotl d2003l) SSP models and con¬ 
volved with the broadband filter passbands to find their broad¬ 
band colours as in Sect. 14.41 Finally, only the galaxies brighter 
than m\j\j - 24 in the first filter redwards from the Ly-a, and 
of good quality SED-fitting (x, < 2), were used for the anal¬ 
ysis (105280 objects). The zPDF of each object was integrated 
within the redshift interval targeted by each colour-colour dia¬ 
gram (Eq.[2}. To create the density plot in Fig. [6] each object was 
weighted by this fraction. We see that while most of the density 
of the galaxies lie within the boundaries of the dropout selection 
boxes, there are also galaxies outside the boxes. The percent¬ 
ages of galaxies outside the boxes are for the BX, LBG, BRi', 
Vi'z' , and Ri'z' selections, respectively, 35%, 39%, 46%, 37%, 
and 39%, i.e. more than one third of the restframe UV bright 
galaxies would be missed by these selections. The existence of 
galaxies outside the selection boxes supports the known fact that 
the dropout selections are not complete. A recent spectroscopic 
study of high redshift galaxies in VUDS survey ( Le Fevre et ali 
12014h also demonstrates the existence of high redshift galax¬ 
ies outside the //(//(-selection box, albeit finding a smaller per¬ 
centage than we did (20%) of galaxies in the redshift range 
2.5 < z < 3.5 outside the box. On the other hand a similar study 
in the VVDS survey (iLe Fevre et al.ll20l3h reveals that 46% of 
the galaxies at the redshift range 2.7 < z < 3.4 and with a ‘reli¬ 
able’ spectral flag are outside the //(/’/(-selection box, while 17% 
of those with a ‘very reliable’ flag are located outside the box. 
Of these two surveys, our selection function resembles more that 
of the VVDS survey (pure magnitude selection) than that of the 
VUDS where a photometric redshift selection was also carried 
out. 

5.2. Number counts 

In order to obtain the number N of objects in a redshift bin 
Z i < z < Zi and magnitude bin m\ < m < mo, we carried out 
a summation over all the objects i in the cleaned ALHAMBRA 
catalogue of the form 

N = £ f PDF,(z)dz. (6) 

m\<rrii<m2 ^ Zl 

For each redshift bin the apparent magnitude refers to the mag¬ 
nitude at the UV continuum as measured by the first filter red¬ 
wards from the Ly-a (and not containing the possible Ly-a line) 
at the corresponding redshift. The summation was carried out in 
five redshift bins. The redshift bins were selected inside the red¬ 
shift range we consider reliable in our ALHAMBRA data (see 
Sect. 13.1 1 >. i.e. 2.2 < z < 5.0, and we opted for a bin width of 
A z = 0.6 to mimic the typical redshift ranges of dropout selected 
LBGs with which we compare our resulting number counts (see 
the references in the next paragraph). The resulting probabilistic 
number counts in bins of 0.5 mag and in redshift bins centred at 
z = 2.5,3.0,3.5,4.0, and 4.5 in a total area of 8572.5 arcmin 2 are 
shown in Fig. [9] These counts are also listed in Table[3] 
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Fig. 8. Density of the ALHAMBRA high redshift galaxies in four colour-colour diagrams used for traditional dropout selections. 
The densities are derived using our probabilistic approach. The selection boxes in each diagram are shown with dashed lines, and 
the redshift ranges they target are indicated in each panel. The contours enclosing 20%, 40%, 60%, 80%, and 90% of the objects are 
marked as solid lines (dashed lines for the BX selection). See the text for more details. [A colour version of this figure is available 
in the online edition .] 


This method implicitly takes into account both the incom¬ 
pleteness and contamination issues. However, this method also 
suffers from quasar contamination as these objects are not con¬ 
sidered by the BPZ. We estimated the maximum quasar contam¬ 
ination rate in the same way as in Sect. 14.3.31 above. We car¬ 
ried out the summation (|6]» over all the magnitudes and from 
Z\ = 2.2 to 7.2 = 3.5 for all non-stellar ALHAMBRA quasars 
with spectroscopic redshift z > 2.2 (19 out of 50 quasars). 
This summation gives 8.27, i.e. 8.27/50 = 0.165 = 17% of 
the high-redshift quasars contaminate our counts. A similar ex¬ 
ercise for all non-stellar ALHAMBRA quasars with spectro¬ 
scopic redshift z < 2.2 (64 out of 155 quasars) leads to 2.48, 
i.e. 1.6% of lower redshift quasars contaminating our counts. 
On the other hand, the total number of ALHAMBRA galaxies 
brighter than m = 24 in the redshift range 2.2 < z < 3.5 given 
by Eq. [6] is 5269.5. Following the prescription in Sect. 14.3.31 
this leads to a maximum contamination by high-redshift quasars 
of 0.165 x 626/5269.5 = 0.019, i.e. < 2%. The total number 
of ALHAMBRA galaxies brighter than m = 24 in the redshift 
range 2.2 < z < 5.0 given by Eq.[6]is 5680.9. Hence, the lower- 
redshift quasars add an additional maximum contamination rate 
of 2 x 99.6 deg~‘ x 2.38 deg x 0.019/5680.9 = 0.0016 < 0.2%. 

For comparison, in Fig. [3 we have also plotted the dropout 
selected BX and LBG candidates of iReddv et~alj (20081 R08), 
and the BRi' and Vi' z' dropout selected LBG candidates of 
lYoshida et al . (20061 Y06). According to R08, their samples are 
centred at redshifts z ~ 2.20 + 0.32 (BX) and z ~ 2.96 + 0.29 
(LBG). The LBG samples of Y06 are centred at z ~ 4.0 + 0.3 
(BRi') and z ~ 4.7 + 0.3 (Vi' z'). We have also overplotted 
in Fig. [9] the z ~ 4 and z ~ 5 dropout selected LBGs of 
IBouwens et al. ( 2014 B14). In our first redshift bin in Fig. [9] 


(z = 2.5 + 0.3) we have plotted both the BX and LBG candidates 
of lReddv et al.l (2008) as our redshift bin is actually in between 
the redshift ranges targeted by these two selections. 

IBouwens et al.l ( 2014 ) lists the surface densities and their er¬ 
rors in a table (Table 6 in B14) and we have plotted them in 
Fig. [9] The plotted errors for the Y06 and R08 samples reflect 
the Poisson errors, and we have corrected the Y06 and R08 
counts for incompleteness and contamination according to the 
information given in the corresponding articles: Y06 have stud¬ 
ied the completeness and contamination of their sample by sim¬ 
ulations. They list the expected number of interlopers for each 
redshift selection and magnitude bin in tables while the com¬ 
pleteness vs. redshift is given in graphic form for each magni¬ 
tude bin. For each magnitude bin we opted to adopt the maxi¬ 
mum completeness from the distribution for the corresponding 
magnitude bin. The spectroscopic sample of R08 gives the ex¬ 
pected contamination rate for each magnitude and redshift bin, 
while R08 studied the completeness of their sample (limited to 
Mab (1700A)< -19.33) by simulations and found that ~ 58% of 
the restframe UV-bright galaxies in the redshift bin 1.9 < z < 2.7 
fulfil the BX colour selection criteria while ~ 47% of the sim¬ 
ilar galaxies in the redshift bin 2.7 < z < 3.4 fulfil the LBG 
colour selection criteria. In other words, they would expect to 
find ~ 42% and ~ 53% of these galaxies outside the BX and 
LBG selection boxes, respectively, quite higher fractions than 
our estimate in Sect. l5.1l above. 

Detailed comparison of our counts with the counts derived 
from dropout selections is not straightforward. The dropout se¬ 
lections target a certain redshift range, but a fraction of galaxies 
from a much wider range of redshift can enter the selections. 
For example, the BX selection of R08 targets the redshift range 
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Fig. 9. Observed number counts for high redshift ALHAMBRA galaxies (crosses). The error bars reflect Poisson errors. For com¬ 
parison, we show the BX (z~ 2.20 + 0.32, filled triangles) and LBG (z~ 2.96 + 0.29, filled squares) number counts of Reddy et 
al. (2008; R08), the BRi' (z~ 4.0 + 0.3, open circles) and Vi'z' (z~ 4.7 + 0.3, filled circles) LBG number counts of Yoshida et al. 
(2006; Y06), and the ~ 4 (open triangles) and ~ 5 (open inverted triangles) LBG number counts of Bouwens et al. (2014; B14). 
The ALHAMBRA limiting magnitude is marked at m = 24 with a blue dashed line. See the text for more details. [A colour version 
of this figure is available in the online edition.] 


Z ~ 2.20 + 0.32, but the spectroscopic redshift distribution of 
the galaxies entering the sample, and not considered as contam¬ 
inants, varies from z ~ 1.4 to z ~ 3.4. Our methodology sim¬ 
ply targets the adopted redshift range. The dropout selections 
rely on contamination and incompleteness corrections, while our 
methodology takes these into account implicitly. Despite these 
differences, the general trends of our counts and the counts from 


literature coincide. However, in the two lowest redshift bins 
(centred at z = 2.5 and z = 3.0) there is a clear difference be¬ 
tween the brightest end of our counts and the brightest bin of 
R08 counts. The last bin of R08 is wide (from m = 19 to m = 22) 
and we can expect that the counts inside the bin are dominated 
by the fainter objects. We have plotted their brightest point at the 
centre of this bin, which slightly exaggerates the difference be- 
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Fig. 10. Observed probabilistic number counts for high redshift ALHAMBRA galaxies. The ML counts are shown as crosses, the 
FP counts as open squares, and the counts derived from an ”Odds ” selected sample are shown as open circles. The error bars reflect 
Poisson errors. The ALHAMBRA limiting magnitude is marked at m = 24 with a blue dashed line. See the text for more details. [A 
colour version of this figure is available in the online edition .] 


tween our counts and their counts. However, this is not enough 
to explain the difference. We do not know where this difference 
comes from, but we note that our sampling at the brightest end 
is clearly better which inclines us to consider our counts more 
reliable. 

Finally, we want to note that in all the redshift bins our counts 
offer a good sampling of the bright end of the surface densities, 
down to the magnitudes m = 21 - 22. It is also remarkable that 
according to our counts the total number of ALHAMBRA galax¬ 


ies brighter than m\j\ = 24 and at redshifts as high as ~ 4.0 and 
~ 4.5 is several hundreds, 406 and 348, respectively. 


5.3. The assumption of a flat prior 
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Table 3. Probabilistic number counts per magnitude bin at each redshift bin. The total area considered here is 8572.5 arcsec 2 . 


Magnitude range 

N (z = 2.5 ± 0.3) 

N (z = 3.0 ± 0.3) 

N (z = 3.5 ± 0.3) 

N (z = 4.0 ± 0.3) 

N (z = 4.5 ± 0.3) 

17.0-17.5 

0.0 

0.0 

0.1 

0.1 

0.0 

17.5 - 18.0 

0.0 

0.0 

0.0 

0.0 

0.1 

18.0-18.5 

0.0 

0.9 

0.0 

0.0 

0.0 

18.5 - 19.0 

0.0 

0.0 

1.0 

1.0 

0.1 

19.0-19.5 

0.1 

0.1 

0.1 

1.1 

1.1 

19.5 - 20.0 

2.0 

1.2 

0.0 

0.8 

2.6 

20.0 - 20.5 

1.1 

0.3 

0.1 

2.2 

2.9 

20.5-21.0 

2.9 

2.8 

0.0 

10.4 

0.4 

21.0-21.5 

2.5 

7.2 

1.5 

1.9 

6.7 

21.5-22.0 

13.4 

19.3 

1.1 

3.0 

3.9 

22.0 - 22.5 

48.2 

47.9 

9.6 

11.0 

8.2 

22.5 - 23.0 

171.4 

169.2 

35.9 

16.9 

26.3 

23.0 - 23.5 

507.8 

442.5 

167.6 

76.4 

74.7 

23.5 - 24.0 

1549.4 

1268.7 

657.2 

281.2 

221.4 

24.0 - 24.5 

2950.7 

2499.5 

1742.8 

819.0 

703.0 

24.5 - 25.0 

3893.4 

3273.7 

2694.3 

1465.7 

1231.0 

25.0 - 25.5 

3576.8 

2744.6 

2449.2 

1575.9 

1353.0 

25.5 - 26.0 

2382.2 

1677.5 

1531.0 

1096.2 

924.4 



Fig. 11. The fraction of galaxies with Odds > 0.3 as a function 
of F814W magnitude for different redshift bins: 0.4 < z < 1 
(black line), z = 2.5 + 0.3 (blue line), z = 3.0 + 0.3 (green line), 
z = 3.5 + 0.3 (magenta line), z = 4.0 ± 0.3 (red line), and z = 
4.5 + 0.3 (cyan line). [A colour version of this figure is available 
in the online edition .] 


means that variation in the density of galaxies as a function of 
redshift was not considered when deriving the zPDFs. This, in 
turn, could lead to net contribution of objects from the denser 
redshift bins to the less dense ones caused by the galaxies with 
badly defined, flat zPDFs. To test the possible effect of this on 
our number counts, we carried out two tests. 

First, we derived the counts using a slightly different ap¬ 
proach. For each object in the cleaned ALHAMBRA catalogue 
we calculated its ML Odds (see Sect. [4j integrating the ML zPDF 
within the range z m i +0.0125(1 +z m i), where z m i refers to the red¬ 
shift of the highest peak of the distribution. Then we eliminated 
the galaxies with a low ML Odds value in order to discard the ob¬ 
jects with flat zPDFs. To do this, we opted to set ML Odds > 0.3. 
For these galaxies, we derived their redshift distribution for each 
magnitude bin and each filter in Fig. [9] by summing their ML 
zPDFs. Next, we scaled these redshift distributions to the total 
number of ALHAMBRA objects in each magnitude bin. From 


these distributions, we derived the number counts for each red¬ 
shift bin of interest. 

Second, despite the possible problems the Bayesian prior 
could cause at the bright end, as was shown in Sect. 13.11 we 
tested how probabilistic number counts would turn out if the FP 
zPDFs were used. The Bayesian prior takes into account the ex¬ 
pected number density variations with redshift and should thus 
take care of the possible net contributions caused by badly de¬ 
fined, flat zPDFs. 

The resulting number counts from the two experiments de¬ 
scribed above are plotted in Fig. [TO] together with the counts 
derived directly by integrating the ML zPDF of each object. 
Interestingly indeed, we see that the ML and FP counts roughly 
coincide. This means that the influence of the prior in the 
ALHAMBRA high redshift galaxy zPDFs is not significant; i.e. 
thanks to the ALHAMBRA multifilter system, the ML method 
alone is capable of recovering the galaxy redshifts. We do not 
have enough objects at the very brightest ends of the counts to 
study if the prior works well there. For this we need to wait for 
data from larger area surveys. 

We see that at the brightest end the results of the Odds exper¬ 
iment coincide with the direct ML counts. At the fainter magni¬ 
tudes, the Odds derived counts tend to be lower than the ML (and 
FP) counts for the two lowest redshift bins (centred at z = 2.5 
and z — 3.0); in the two following bins (centred at z — 3.5 and 
Z = 4.0) all counts coincide in all magnitudes (up to the limit¬ 
ing magnitude), and in the last bin (centred at z = 4.5) the Odds 
derived counts tend to be higher in the faintest magnitudes than 
the ML/FP counts. This could mean that at the lower redshift 
bins and fainter magnitudes a net contribution from low redshift 
galaxies with flat zPDFs affects our counts and tends to overes¬ 
timate them, while at the brightest magnitude bins the effect is 
the opposite. However, considering that the ML and FP counts 
do agree in these magnitude bins, we do not believe this is the 
case. The same effect can be obtained if a smaller number of 
high redshift galaxies have good Odds values at the first redshift 
bins than the lower redshift galaxies, and the opposite would be 
true for the last redshift bin. 

To study this in greater de tail, we derived the Odds sam pling 
rate (OSR) as introduced in iLopez-Saniuan et al . (i2014h . This 
gives the fraction of galaxies with good Odds values (in this case 
Odds > 0.3) to the total number of galaxies as a function of 
magnitude in the detection filter, F814W. In Fig. [IT] we show 
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the OSR vs. magnitude derived for 0.4 < z < 1 as a reference 
curve for lower redshift galaxies, and the corresponding curves 
for the redshift bins that we are interested in: z = 2.5 + 0.3,3.0 ± 
0.3,3.5 + 0.3,4.0 ± 0.3, and 4.5 ± 0.3. We see that, at magnitudes 
fainter than ~ 20, the OSR indeed depends on redshift, being 
lowest for our lowest redshift bin and systematically increasing 
with redshift, the OSR of our highest redshift bin being higher 
than that of the reference curve. Actually, this behaviour is also 
visible in the recovered summed zPDFs of our simulated high 
redshift galaxies in Sect. 13. II In Fig.|3]we see how the recovered 
summed zPDF becomes narrower with increasing redshift. 

To summarise, deriving the galaxy redshift distribution from 
an Odds selected sample should be considered with caution as 
at high redshift OSR strongly depends on redshift. Luckily, we 
do not need to rely on such an approach as, despite our worries 
about the use of a prior in our high redshift study (Sect. [3TTJ- the 
prior does not seem to influence our counts significantly; both 
ML and FP zPDFs give similar results. As the prior takes into 
account the varying galaxy density with redshift, and the FP and 
ML counts coincide, there clearly is no significant net contribu¬ 
tion of objects with spread out zPDFs from the denser redshift 
bins to the less dense ones. The study of the very brightest and 
noisy end of our counts (from m ~ 19 to m ~ 21 - 22) needs to 
wait for data from larger area surveys, like I-PLUS and J-PAS. 
From the number counts derived in this section, we estimate that 
these surveys will detect tens of thousands of high redshift galax¬ 
ies brighter than m = 22.5. 

6. Summary 

So far, most of the studies of the high redshift UV bright galaxy 
population have been based on dropout selections. Spectroscopic 
follow-up of dropout selected samples (e.g. iReddv et alj[2008l) 
have shown that the dropout selection suffers from severe con¬ 
tamin ation. Simulations (e.g. Yoshida et al. 2006; Rcddv ~et~aT1 
2008) and a spectroscopic study of high redshift galaxies se- 
lected from a purely flux-selected sample (Le Fevre et afll20051) 
have shown that the dropout selection is also highly incom¬ 
plete. This_is further supported by a wide spectroscopic sample 
bv lLe Fevre et al.1 (20141) , where the candidates are selected us¬ 
ing photometric redshifts. We expect an alternative probabilistic 
method, like the one presented here, would help to remove this 
kind of biases. 

We have studied the high redshift UV bright galaxy popu¬ 
lation in ALHAMBRA data adopting a novel approach based 
on redshift probability distribution functions (zPDFs). We have 
shown how a clean sample of high redshift galaxies can be de¬ 
rived from the ALHAMBRA catalogue, integrating the zPDFs 
and selecting only those galaxies with very high probability to 
be at high redshift. We studied whether this clean sample would 
be selected by the traditional dropout techniques, and basically 
all of the galaxies in our sample actually would also be selected 
by these methods at 83 - 99% levels. However, the benefit of 
our selection compared to the traditional dropout selections is 
the expected very low percentage of interlopers. 

We have also shown that our clean sample suffers from se¬ 
vere incompleteness and is not able to derive any reliable sta¬ 
tistical properties about the high redshift galaxy population. We 
have introduced a probabilistic method which takes into account 
both incompleteness and contamination in a natural way. In this 
approach, the galaxies are not treated as unities but rather as frac¬ 
tions in each redshift, where the size of this fraction is derived 
by integrating the corresponding zPDF of each galaxy at the red¬ 
shift range of interest. Using this approach, we have studied the 


distribution of the ALHAMBRA high redshift galaxies in the tra¬ 
ditional colour-colour diagrams and discovered that a significant 
percentage of them (> 30%) are located outside the traditional 
selection boxes. We have also derived the probabilistic number 
counts in five redshift bins from z = 2.5 to z = 4.5. The strength 
of our counts is the good sampling of the bright end, down to 
muv(AB)=21-22. 

In our simulation we discovered that if the Bayesian prior 
was used, very bright high redshift galaxies would systemati¬ 
cally not be selected to form part of our clean sample. For this 
reason, all the above studies are based on maximum likelihood 
(ML) zPDFs, i.e. throughout the paper we have assumed a flat 
prior. However, we tested how the probabilistic number counts 
would turn out if the full probability (FP) zPDFs were used. We 
found that the FP and ML counts closely match. This reinforces 
the reliability of these counts. To know if this holds at the very 
brightest magnitudes, where our data is dominated by noise, data 
from still larger area surveys is needed. We would like to be 
able to come back to this issue once the data from wide area 
(~ 8500 deg 2 ) J-PLUS and J-PAS multifilter surveys are avail¬ 
able. From the number counts derived in this work, we estimate 
that we could detect tens of thousands of high redshift galaxies 
brighter than ;;iuv(AB)=22.5 in these surveys. 

We also repeated the probabilistic number counts calcula¬ 
tion deriving the ML redshift distributions in each magnitude 
bin from a selection of galaxies with well-defined photometric 
redshifts (ML Odds > 0.3) and scaling these to the total num¬ 
ber of objects in each magnitude bin. In the faintest magnitude 
bins we find differences between these and the direct ML counts. 
Considering that the FP and ML counts roughly coincide in all 
magnitude bins, we inferred that the differences seen with the 
counts derived from the Odds selected sample are due to varia¬ 
tions in the fractional amount of galaxies with good Odds val¬ 
ues with redshift. We studied the evolution of this fraction as a 
function of magnitude and redshift, and found out that at faint 
magnitudes this fraction indeed varies with redshift. 

Even though we have discussed here only the application of 
the probabilistic method of deriving the galaxy number counts, a 
similar approach could be used to study any redshift dependent 
galaxy property. iMcLure et al. 1 ( 2009 ) used a similar approach to 
derive LBG luminosity functions and, recently, Lopez-Sanjuan 
et al. (2014) discussed a similar approach to study the galaxy 
merger fraction. We will further study the ALHAMBRA high 
redshift galaxies using this methodology in the forthcoming pa¬ 
pers. 

Theoretically, our probabilistic method is totally free of bi¬ 
ases due to incompleteness and contamination. However, this is 
not totally true, as our photo-z estimations are limited to what 
is already known about the galaxy population because empiri¬ 
cal templates are used. The way to improve this aspect of the 
method is to create unbiased lists of candidates, spectroscop¬ 
ically confirm them, and consequently refine the high redshift 
templates. For the unbiased candidate selection, zPDFs offer a 
unique opportunity. A wide spectroscopic campaign on candi¬ 
dates selected using zPDFs is already in progress dLe Fevre et alJ 
120141 ). Once the spectra of objects derived from unbiased sam¬ 
ples are available, these can be used to improve the photo-z esti¬ 
mations at high redshift, and subsequently improve the accuracy 
of the statistical methods like the one presented here. 
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